Semi-supervised Clustering for Word Instances and Its Effect on Word Sense Disambiguation

Authors

  • Kazunari Sugiyama
  • Manabu Okumura
Abstract

We propose a supervised word sense disambiguation (WSD) system that uses features obtained from clustering results of word instances. Our approach is novel in that we employ semi-supervised clustering that controls the fluctuation of the centroid of a cluster, and we select seed instances by considering the frequency distribution of word senses and exclude outliers when we introduce “must-link” constraints between seed instances. In addition, we improve the supervised WSD accuracy by using features computed from word instances in clusters generated by the semi-supervised clustering. Experimental results show that these features are effective in improving WSD accuracy.
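The idea of seeding clusters with "must-link" constraints can be illustrated with a minimal sketch. This is not the authors' algorithm. It makes several assumptions: instances are plain numeric feature vectors, seeds sharing a sense label are must-linked into one initial cluster, and centroids are computed from the seeds only and then held fixed (a deliberately crude stand-in for controlling centroid fluctuation). The function names `cosine`, `centroid`, and `seeded_clusters` and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def seeded_clusters(vectors, seed_senses):
    """Assign each instance to a cluster named after a seed sense.

    vectors:     list of feature vectors, one per word instance
    seed_senses: {instance index: sense label} for the labeled seeds
    """
    # Must-link (assumed form): seeds with the same sense share a cluster.
    groups = defaultdict(list)
    for idx, sense in seed_senses.items():
        groups[sense].append(idx)

    # Centroids come from seeds only and are never updated afterwards.
    centroids = {
        sense: centroid([vectors[i] for i in idxs])
        for sense, idxs in groups.items()
    }

    # Every unlabeled instance joins the most similar seed centroid.
    assignment = dict(seed_senses)
    for i, vec in enumerate(vectors):
        if i not in assignment:
            assignment[i] = max(
                centroids, key=lambda s: cosine(vec, centroids[s])
            )
    return assignment


# Toy data: four seeds for two hypothetical senses, two unlabeled instances.
vectors = [[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9], [0.8, 0.3], [0.2, 0.7]]
seeds = {0: "finance", 1: "finance", 2: "river", 3: "river"}
clusters = seeded_clusters(vectors, seeds)
```

The resulting cluster label of each instance could then be passed to a supervised classifier as one extra feature, which is the general direction the abstract describes. The features the authors actually compute are not specified here.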

Similar resources

Semi-supervised Learning by Fuzzy Clustering and Ensemble Learning

This paper proposes a semi-supervised learning method that uses fuzzy clustering to solve word sense disambiguation problems. Furthermore, we reduce the side effects of semi-supervised learning through ensemble learning. We set classes for labeled instances: each labeled instance serves as the prototype of its corresponding class. By using fuzzy clustering for unlabeled instances, prototypes are moved to more sui...

A Semi-Supervised Feature Clustering Algorithm with Application to Word Sense Disambiguation

In this paper we investigate an application of feature clustering to word sense disambiguation and propose a semi-supervised feature clustering algorithm. Compared with other feature clustering methods (e.g., supervised feature clustering), it can infer the distribution of class labels over (unseen) features unavailable in training data (labeled data) by the use of the distribution of class labe...

Word Sense Induction and Disambiguation Rivaling Supervised Methods

Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing. Although supervised learning has been shown to provide superior WSD performance, current sense-annotated corpora do not contain a sufficient number of instances per word type to train supervised systems for all words...

Word Sense Disambiguation by Semi-supervised Learning

In this paper we propose using a semi-supervised learning algorithm to address the word sense disambiguation problem. We evaluated a semi-supervised learning algorithm, the local and global consistency algorithm, on a widely used benchmark corpus for word sense disambiguation. This algorithm yields encouraging experimental results. It achieves better performance than orthodox supervised learning algor...

Word Sense Disambiguation in Hindi Language Using Hyperspace Analogue to Language and Fuzzy C-Means Clustering

The problem of Word Sense Disambiguation (WSD) can be defined as the task of assigning the most appropriate sense to a polysemous word within a given context. Many supervised, unsupervised, and semi-supervised approaches have been devised to deal with this problem, particularly for the English language. However, this is not the case for the Hindi language, where not much work has been done. In th...

Publication date: 2009